On the tree-transformation power of XSLT

Wim Janssen    Alexandr Korlyukov*    Jan Van den Bussche



Abstract 



XSLT is a standard rule-based programming language for expressing
transformations of XML data. The language is currently in transition
from version 1.0 to 2.0. In order to understand the computational
consequences of this transition, we restrict XSLT to its pure
tree-transformation capabilities. Under this focus, we observe that
XSLT 1.0 was not yet a computationally complete tree-transformation
language: every 1.0 program can be implemented in exponential time.
A crucial new feature of version 2.0, however, which allows node sets
over temporary trees, yields completeness. We provide a formal
operational semantics for XSLT programs, and establish confluence for
this semantics.

1 Introduction

XSLT is a powerful rule-based programming language, relatively widely
used, for expressing transformations of XML data, and is developed by the
W3C (World Wide Web Consortium). An XSLT program is run on an XML
document as input, and produces another XML document as output. (XSLT
programs are actually called "stylesheets", as one of their main uses is
to produce stylised renderings of the input data, but we will continue to
call them programs here.)

The language is actually in a transition period: the current standard, 
version 1.0, is being replaced by version 2.0. It is important to understand 
what the new features of 2.0 really add. In the present paper, we focus on 
the tree-transformation capabilities of XSLT. Indeed, XML documents are 
essentially ordered, node-labeled trees. 

* Alexandr Korlyukov, who was with Grodno State University, Belarus, sadly passed
away shortly after we agreed to write a joint paper.

From the perspective of tree-transformation capabilities, the most
important new feature is that of "node sets over temporary trees". We will
show that this feature turns XSLT into a computationally complete
tree-transformation language. Indeed, as we will also show, XSLT 1.0 was not
yet complete in this sense. Specifically, any 1.0 program can be implemented
within exponential time in the worst case. Some programs actually express
PSPACE-complete problems, because we will show that any linear-space
Turing machine can be simulated by an XSLT 1.0 program.

To put our results in context, we note that the designers of XSLT will 
most probably regard the incompleteness of their language as a feature, 
rather than a defect. Indeed, in the requirements document for 2.0, turning
XSLT into a general-purpose programming language is explicitly stated as a
"non-goal". In that respect, our result on the completeness of 2.0 exposes
(albeit in a narrow sense) a failure to meet the requirements!

At this point we should be a little clearer on what we mean by "focusing 
on the tree-transformation capabilities of XSLT". As already mentioned, 
XML documents are essentially trees where the nodes are labeled by
arbitrary strings. We abstract away from this string content by regarding
the node labels as coming from some finite alphabet. Accordingly, we strip
XSLT of its string-manipulation functions, and restrict its arithmetic to 
arbitrary polynomial-time functions on counters, i.e., integers in the range 
{1, 2, . . . , n} with n the number of nodes in the input tree. It is, incidentally, 
quite easy to see that XSLT 1.0 without these restrictions can express all 
computable functions on strings (or integers). Indeed, rules in XSLT can be 
called recursively, and we all know that arbitrary recursion over the strings 
or the integers gives us completeness. 

We will provide a formal operational semantics for the substantial frag- 
ment of XSLT discussed in this paper. A formal semantics has not been 
available, although the W3C specifications represent a fine effort in defining 
it informally. Of course we have tried to make our formalisation faithful to 
those specifications. Our semantics does not impose an order on operations
when there is no need to, and as a consequence the transition relation
is non-deterministic. We establish, however, a confluence property, so that
any two terminating runs on the same input yield the same final result. 
Confluence was not yet proven rigorously for XSLT, and can help in pro- 
viding a formal justification for alternative processing strategies that XSLT 
implementations may follow for the sake of optimisation. 
Figure 1: A data tree.



2 Data model 
2.1 Data trees 

Let Σ be a finite alphabet, including the special label doc. By a data tree
we simply mean a finite ordered tree, in which the nodes are labeled by
elements of Σ. Up to isomorphism, we can describe a data tree t by a string
string(t) over the alphabet Σ extended with the two symbols { and }: if the
root of t is labeled a and its sequence of top-level subtrees is t_1, ..., t_k, then

    string(t) = a{string(t_1) ... string(t_k)}.

Thus, for the data tree shown in Figure 1, the string representation equals

    a{b{}c{a{}b{}}c{}}.

A data forest is a finite sequence of data trees. Forests arise naturally
in XSLT, and for uniformity reasons we need to be able to present them as
data trees. This can easily be done as follows:

Definition 1 (maketree). Let F be a data forest. Then maketree(F) is
the data tree obtained by affixing a root node on top of F, and labeling this
root node with doc.¹

¹ The root node added by maketree models what is called the "document root" in the
XPath data model [6], although we do not model it entirely faithfully, as we do not formally
distinguish "document nodes" from "element nodes". This is only for simplicity; it is no
problem to incorporate this distinction in our formalism, and our technical results do not
depend on our simplification.

2.2 Stores and values

Let T be a supply of tree variables, including the special tree variable Input.
We define:

Definition 2. A store is a finite set S of pairs of the form (x, t), where
x ∈ T and t is a data tree, such that (1) Input occurs in S; (2) no tree
variable occurs twice in S; and (3) all data trees occurring in S have disjoint
sets of nodes.

The tree assigned to Input is called the input tree; the other trees are
called the temporary trees.

Definition 3. A value over S is a finite sequence consisting of nodes from 
trees in S, and counters over S. Here, a counter over S is an integer in the 
range {1, 2, . . . , n}, where n is the total number of nodes in S. 

Values as defined above formalise the kind of values that can be returned
by XPath expressions. XPath is a language that is used as a sublanguage
in XSLT for the purpose of selecting nodes from trees. But XPath
expressions can also return numbers, which is useful as an aid in making 
node selections (e.g., the i-th child of a node, or the i-th node of the tree in 
preorder). We limit these numbers to counters, in order to concentrate on 
pure tree transformations. 
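
As a rough illustration of the data model of this section, the following Python sketch represents data trees, their string representation, and maketree (Definition 1). The names `Node`, `string_of`, and `fig1` are our own choices for illustration and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node of a data tree: a label plus an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def string_of(t: Node) -> str:
    """String representation of Section 2.1: a{string(t_1)...string(t_k)}."""
    return t.label + "{" + "".join(string_of(c) for c in t.children) + "}"


def maketree(forest: List[Node]) -> Node:
    """Definition 1: affix a root labeled 'doc' on top of a data forest."""
    return Node("doc", list(forest))


# The tree whose string representation is given in Section 2.1.
fig1 = Node("a", [Node("b"), Node("c", [Node("a"), Node("b")]), Node("c")])
print(string_of(fig1))
print(string_of(maketree([Node("a"), Node("b")])))
```

Representing a forest as a plain Python list keeps maketree a one-liner; the formal requirement that trees in a store have disjoint node sets is not enforced here.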

3 XPath abstraction 

Since the language XPath is already well understood, and
its study in itself is not our focus, we will work with an abstraction of
XPath, which we denote by X. For our purposes it will suffice to divide the
X-expressions into only two different types, which we denote by nodes and
mixed. A value is of type nodes if it consists exclusively of nodes; otherwise
it is of type mixed.

In order to define the semantics of X, we need some definitions, which 
reflect those from the XPath specification. Let V be a supply of value
variables, disjoint from T. 

Definition 4. An environment over S is a finite set E of pairs of the form
(x, v), where x ∈ V and v is a value over S, such that no value variable
occurs twice in E.

Definition 5. A context triple over S is a triple (z, i, k) where z is a node
from S or a counter over S, and i and k are counters over S such that i ≤ k.
We call z the context item, i the context position, and k the context size.

Definition 6. A context is a triple (S,E, c) where S is a store, E is an 
environment over S, and c is a context triple over S. 






If we denote the universe of all possible contexts by Contexts, the
semantics of X is now given by a partial function eval on X × Contexts, such
that whenever defined, eval(e, C) is a value over C's store, and this value
has the same type as e.

Remark 3.1. A static type system, based on XML Schema, can be
put on contexts to ensure definedness of expressions [7], but we omit that
as safety is not the focus of the present paper. □

In general we do not assume much from X, except for the availability of 
the following basic expressions, also present in real XPath: 

• An expression '/*', such that eval(/*, C) equals the root node of the
input tree in C's store.

• An expression 'child::*', such that eval(child::*, C) is defined
whenever C's context item is a node n, and then equals the list of children
of n.
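
The two basic expressions can be sketched in Python as follows. This is only an illustration under our own assumptions: the class names `Node` and `Context`, the function `eval_expr`, and the use of a `ValueError` to model "undefined" are not taken from the paper, and stores and environments are simplified to dictionaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(eq=False)  # nodes are compared by identity, not by structure
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Context:
    store: Dict[str, Node]            # tree variable -> data tree; contains "Input"
    env: Dict[str, list]              # value variable -> value
    triple: Tuple[object, int, int]   # (context item, context position, context size)


def eval_expr(e: str, ctx: Context) -> list:
    """Partial function eval for the two basic expressions of Section 3."""
    if e == "/*":
        # The root node of the input tree, as a one-element value.
        return [ctx.store["Input"]]
    if e == "child::*":
        item = ctx.triple[0]
        if not isinstance(item, Node):
            # eval is partial: here it is undefined when the item is a counter.
            raise ValueError("child::* is undefined when the context item is not a node")
        return list(item.children)
    raise ValueError(f"expression {e!r} is not covered by this sketch")


root = Node("a", [Node("b"), Node("c")])
ctx = Context({"Input": root}, {}, (root, 1, 1))
print([n.label for n in eval_expr("child::*", ctx)])
```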

4 Syntax 

In this section, we define the syntax of a sizeable fragment of XSLT 2.0. The 
reader familiar with XSLT will notice that we have simplified and cleaned 
up the language in a few places. These modifications are only for the sake 
of simplicity of exposition, and our technical results do not depend on them. 
We discuss our deviations from the real language further in Section 5.5.

Also, the concrete syntax of real XSLT is XML-based and rather un- 
wieldy. For the sake of presentation, we therefore give a syntax of our own, 
which is non-XML, but otherwise follows the same lines as the real syntax. 

The grammar is shown in Figure 2. The only typing condition we need
is that in an apply-statement or in a vcopy-statement, expr must be of type 
nodes. Also, no two different rules can have the same name, and the name 
in a call-statement must be the name of some rule. 

We will often identify a template M with its syntax tree. This tree con- 
sists of all occurrences of statements in M and represents how they follow 
each other and how they are nested in each other; we omit the formal defi- 
nition. Observe that only cons-, foreach-, tree-, and if-statements can have 
children. Note also that, since a template is a sequence of statements, the 
syntax "tree" is actually a forest, i.e., a sequence of trees, but we will still 
call it a tree. 
    Program    →  Rule*
    Rule       →  template name match expr (mode name)? { Template }
    Template   →  Statement*
    Statement  →  cons label { Template }
               |  apply expr (mode name)?
               |  call name
               |  foreach expr { Template }
               |  val value-variable expr
               |  tree tree-variable { Template }
               |  vcopy expr
               |  tcopy tree-variable
               |  if expr { Template } else { Template }

Figure 2: Our syntax. The terminal symbol expr stands for an
X-expression; label stands for an element of our alphabet Σ; value-variable
and tree-variable stand for elements of V and T, respectively; and name
is self-explanatory. As usual we use * to denote repetition, ? to denote 
optionality, and use ( and ) for lexical grouping. 

Variable definitions happen through val- and tree-statements. We will 
need the notion of a statement being in the scope of some variable definition; 
this is defined in the standard way as follows. 

Definition 7. Let M be a template, and let S_1 and S_2 be two statements
occurring in M. We say that S_2 is in the scope of S_1 if S_2 is a right sibling
of S_1 in the syntax tree of M, or a descendant of such a right sibling. An
illustration is in Figure 3.
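
Definition 7 can be made concrete with a small Python sketch that collects all statements in the scope of a given statement. The class `Stmt` and the function names are hypothetical and chosen only for this illustration; a template is modelled as a list of statement trees.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # statement occurrences are compared by identity
class Stmt:
    text: str
    children: List["Stmt"] = field(default_factory=list)


def descendants_or_self(s: Stmt) -> Iterator[Stmt]:
    """Yield s and all statements nested below it, in document order."""
    yield s
    for c in s.children:
        yield from descendants_or_self(c)


def scope(template: List[Stmt], s1: Stmt) -> List[Stmt]:
    """Statements in the scope of s1: its right siblings and their descendants."""
    def search(siblings: List[Stmt]) -> Optional[List[Stmt]]:
        for i, s in enumerate(siblings):
            if s is s1:
                return [d for r in siblings[i + 1:] for d in descendants_or_self(r)]
            found = search(s.children)
            if found is not None:
                return found
        return None

    result = search(template)
    if result is None:
        raise ValueError("statement does not occur in the template")
    return result


before = Stmt("cons a")
v = Stmt("val x e")
inner = Stmt("vcopy x")
f = Stmt("foreach e", [inner])
template = [before, v, f]
print([s.text for s in scope(template, v)])
```

Note that statements to the left of the definition, and the definition itself, are never in its scope.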

One final definition: 

Definition 8. Template M' is called a subtemplate of template M if M' 
consists of a sequence of consecutive sibling statements occurring in M. 

5 Operational semantics 

Fix a program P and a data tree t. We will describe the semantics of P on 
input t as a rewrite relation ⇒ among configurations.

Definition 9. A configuration consists of a template M together with a
partial function that assigns a context to some of the statements of M (more
precisely, the nodes of its syntax tree). The statements that have a context
are called active; we require that the descendants of an inactive node are
inactive too. Cons-statements are never active.

Figure 3: Depiction of a syntax tree. The nodes in the scope of the black
node are those that are striped.

We use the following notation concerning configurations:

• S ◁ γ denotes that S is a statement occurring in the template of
configuration γ.

• If S ◁ γ, then γ(S) = C denotes that S is active in γ, having context
C.

• If M is a subtemplate of a configuration γ, then M itself can be taken
as a configuration by inheriting all the context assignments done by
γ. We call such a configuration a subconfiguration.

• If M is a subconfiguration of γ, and γ′ is another configuration, then
γ(M ← γ′) denotes the configuration obtained from γ by replacing M
by γ′.

The initial configuration is defined as follows.

Definition 10.

1. The initial context equals ({(Input, t)}, ∅, (r, 1, 1)),
where r is the root of t.

2. The initial template equals the single statement 'apply /*'.

3. The initial configuration consists of the initial template, whose single
statement is assigned the initial context.
• S = if e { M_true } else { M_false } ◁ γ
  γ(S) = C
  eval(e, C) ≠ ε
  ------------------------------
  γ ⇒_if γ(S ← M_true)

• S = if e { M_true } else { M_false } ◁ γ
  γ(S) = C
  eval(e, C) = ε
  ------------------------------
  γ ⇒_if γ(S ← M_false)

Figure 4: Semantics of if-statements; ε denotes the empty sequence.

The goal will be to rewrite the initial configuration into a terminal tem- 
plate; this is a configuration consisting exclusively of cons-statements. Ob- 
serve that terminal templates can be viewed as data forests; indeed, simply 
by removing the cons's from a terminal template, we obtain the string rep- 
resentation of a data forest. 

For the rewrite relation ⇒ we are going to define, terminal configurations
will be normal forms, i.e., cannot be rewritten further. If, for two
configurations γ_0 and γ_1, we have γ_0 ⇒ ··· ⇒ γ_1 and γ_1 is a normal form, we denote
that by γ_0 ⇒! γ_1. The relation ⇒ will be defined in such a way that if γ_0 is
the initial configuration and γ_0 ⇒! γ_1, then γ_1 will be terminal. Moreover,
we will prove in Theorem 1 that each configuration γ_0 has at most one such
normal form γ_1. We thus define:

Definition 11. Given P and t, let γ_0 be the initial configuration and let
γ_0 ⇒! γ_1. Then the final result tree of applying P to t is defined to be
maketree(γ_1).

In the above definition, we can indeed apply maketree, defined on data
forests (Definition 1), to γ_1, since γ_1 is terminal and we just observed that
terminal templates describe forests. Note that the final result tree is only
determined up to isomorphism.

5.1 If-statements 

If-statements are the only ones that generate control flow, so we treat them
by a separate rewrite relation ⇒_if, defined by the semantic rules shown in
Figure 4.
It is not difficult to show that ⇒_if is terminating and locally confluent,
whence confluent, so that every configuration has a unique normal form
w.r.t. ⇒_if [26]. This normal form no longer contains any active if-statements.
(Quite obviously, the most efficient way to get to this normal form is to work
out the if-statements top-down.) We write γ ⇒_if! γ′ to denote that γ′ is the
normal form of γ w.r.t. ⇒_if.

Remark 5.1. Our main rewrite relation ⇒ is not terminating in general.
The reason why we treat if-statements separately is to avoid nonsensical 
rewritings such as where we execute a non-terminating statement in the 
else-branch of an if-statement whose test evaluates to true. 

5.2 Apply-, call-, and foreach-statements 

For the semantics of apply-statements, we need the following definitions. 

Definition 12 (ruletoapply). Let C be a context, let n be a node, and let
m be a name. Then ruletoapply(C, n) (respectively, ruletoapply(C, n, m))
equals the template belonging to the first rule in P (respectively, the first
rule with mode name equal to m) whose expr satisfies n ∈ eval(expr, C).

If no such rule exists, both ruletoapply(C, n) and ruletoapply(C, n, m)
default to the single-statement template 'apply child::*'.

Definition 13 (init). Let M be a template, and let C be a context. Then
init(M, C) equals the configuration obtained from M by assigning context
C to every statement in M, except for all statements in the scope of any
variable definition, and all statements that are below a foreach-statement;
all those statements remain inactive.

We are now ready for the semantic rule for apply-statements, shown in
Figure 5. We omit the rule for an apply-statement with a mode m: the only
difference with the rule shown is that we use ruletoapply(n_i, C, m).

The semantic rule for foreach-statements is very similar to that for
apply-statements, and is also shown in Figure 5.

For call-statements, we need the following definition. 

Definition 14 (rulewithname). For any name, let rulewithname(name)
denote the template of the rule in P with that name.

The semantic rule for a call-statement is then again shown in Figure 5.
• S = apply e ◁ γ
  γ(S) = C = (S, E, c)
  eval(e, C) = (n_1, ..., n_k)
  ruletoapply(n_i, C) = M_i for i = 1, ..., k
  init(M_i, (S, E, (n_i, i, k))) = γ_i for i = 1, ..., k
  γ(S ← γ_1 ... γ_k) ⇒_if! γ′
  ------------------------------
  γ ⇒ γ′

• S = foreach e { M } ◁ γ
  γ(S) = C = (S, E, c)
  eval(e, C) = (z_1, ..., z_k)
  init(M, (S, E, (z_i, i, k))) = γ_i for i = 1, ..., k
  γ(S ← γ_1 ... γ_k) ⇒_if! γ′
  ------------------------------
  γ ⇒ γ′

• S = call name ◁ γ
  γ(S) = C
  rulewithname(name) = M
  init(M, C) = γ_1
  γ(S ← γ_1) ⇒_if! γ′
  ------------------------------
  γ ⇒ γ′

Figure 5: Semantics of apply-, call-, and foreach-statements.
• S = val x e ◁ γ
  γ(S) = C
  C(x: eval(e, C)) = C′
  updateset(γ, S) = M
  init(M, C′) = γ_1
  γ(S M ← γ_1) ⇒_if! γ′
  ------------------------------
  γ ⇒ γ′

• S = tree y { M } ◁ γ
  M is terminal
  γ(S) = C
  C(y: maketree(M)) = C′
  updateset(γ, S) = M′
  init(M′, C′) = γ_1
  γ(S M′ ← γ_1) ⇒_if! γ′
  ------------------------------
  γ ⇒ γ′

Figure 6: Semantics of variable definitions.

5.3 Variable definitions 

For a context C = (S, E, c), a value variable x, a value v, a tree variable y,
and a data tree t, we denote by

• C(x: v) the context obtained from C by updating E with the pair
(x, v); and by

• C(y: t) the context obtained from C by updating S with the pair
(y, t).

We also define: 

Definition 15 (updateset). Let γ be a configuration and let S ◁ γ. Let
M be the template underlying γ. Let S_1, ..., S_k be the right siblings of S
in M, in that order. Let j be the smallest index for which S_j is active in
γ; if all the S_i are inactive, put j = k + 1. Then the template S_1 ... S_{j-1} is
denoted by updateset(γ, S). If j = 1 then this is the empty template.
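
To illustrate Definition 15, here is a minimal Python sketch of updateset. It assumes, purely for illustration, that the statements at one nesting level are given as a list, that S is identified by its index in that list, and that activeness is supplied as a predicate; none of these representation choices come from the paper.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")


def updateset(siblings: Sequence[S], s_index: int,
              is_active: Callable[[S], bool]) -> List[S]:
    """Right siblings of siblings[s_index], up to (excluding) the first active one."""
    result: List[S] = []
    for sib in siblings[s_index + 1:]:
        if is_active(sib):
            break  # this is S_j, the first active right sibling
        result.append(sib)
    return result


stmts = ["val x e", "vcopy x", "apply y", "call r"]
active = {"call r"}
print(updateset(stmts, 0, lambda s: s in active))
```

The statements returned are exactly those that were left inactive because they lie in the scope of the variable definition, and that can be initialised once the variable has received its value.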

We are now ready for the semantic rules for variable definitions, shown
in Figure 6.
[Figure 7: drawing of a store S and of the terminal template ttemp(forest((n_4, n_1, n_2, n_3, n_1), S)).]


5.4 Copy-statements 

The following definitions are illustrated in Figure 7.

Definition 16 (forest). Let S be a store, and let (n_1, ..., n_k) be a sequence
of nodes from S. For i = 1, ..., k, let t_i be the data subtree rooted at n_i.
Then forest((n_1, ..., n_k), S) equals the data forest (t_1, ..., t_k).

Definition 17 (ttemp). Let F be a data forest. Then ttemp(F) equals
the terminal template describing F.

We also need:

Definition 18 (choproot). Let t be a data tree with top-level subtrees
t_1, ..., t_k, in that order. Then choproot(t) equals the data forest (t_1, ..., t_k).

The semantic rules for copy-statements are now shown in Figure 8.
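
Definitions 16 to 18 can be sketched in Python as follows. The representation is our own assumption: a node object stands for the subtree rooted at it, so forest is the identity on node sequences, and ttemp renders a forest as the text of a terminal template in the syntax of Figure 2.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass(eq=False)
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def forest(nodes: Sequence[Node]) -> List[Node]:
    """Definition 16: the subtrees rooted at the given nodes, in order."""
    return list(nodes)


def ttemp(f: Sequence[Node]) -> str:
    """Definition 17: the terminal template (cons-statements only) describing f."""
    parts = []
    for t in f:
        inner = ttemp(t.children)
        parts.append(f"cons {t.label} {{ {inner} }}" if inner else f"cons {t.label} {{}}")
    return " ".join(parts)


def choproot(t: Node) -> List[Node]:
    """Definition 18: the forest of top-level subtrees of t."""
    return list(t.children)


tree = Node("a", [Node("b"), Node("c", [Node("a"), Node("b")]), Node("c")])
b, c, c2 = tree.children
print(ttemp(forest([c, b])))     # what a vcopy of the sequence (c, b) would insert
print(ttemp(choproot(tree)))     # what a tcopy of this tree would insert
```

Because ttemp produces fresh template text, a node that is selected twice is simply copied twice, which matches the copying behaviour of the vcopy rule.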

5.5 Discussion 

The final result of applying P to t (Definition 11) may be undefined for two
very different reasons. The first, fundamental, reason is that the rewriting
may be nonterminating. The second reason is that the rewriting may abort
because the evaluation of an X-expression is undefined, or the tree variable in
a tcopy-statement is not defined in the store. This second reason can easily
be avoided by a type system on X, as already mentioned in Remark 3.1,
together with scoping rules to keep track of which variables are visible in
the XSLT program and which variables are used in the X-expressions. Such
scoping rules are entirely standard, and indeed are implemented in the XSLT
processor SAXON.

Figure 7: Illustration of Definitions 16 and 17.

• S = vcopy e ◁ γ
  γ(S) = C = (S, E, c)
  eval(e, C) = (n_1, ..., n_k)
  ttemp(forest((n_1, ..., n_k), S)) = M
  ------------------------------
  γ ⇒ γ(S ← M)

• S = tcopy y ◁ γ
  γ(S) = (S, E, c)
  (y, t) ∈ S
  ttemp(choproot(t)) = M
  ------------------------------
  γ ⇒ γ(S ← M)

Figure 8: Semantics of copy-statements.

In the same vein, we have simplified the parameter passing mechanism
of XSLT, and have omitted the feature of global variables. On the other
hand, our mechanism for choosing the rule to apply (Definition 12) is more
powerful than the one provided by XSLT, as ours is context-dependent. It
is actually easier to define that way. As already mentioned at the beginning
of Section 4, none of our technical results depend on the modifications we
have made.

Finally, we note that the XSLT processor SAXON evaluates variable 
definitions lazily, whereas we simply evaluate them eagerly. Again, lazy 
evaluation could have been easily incorporated in our formalism. Some 
programs may terminate on some inputs lazily, while they do not terminate 
eagerly, but for programs that use all the variables they define there is no 
difference. 

5.6 Confluence 

Recall that we call a rewrite relation confluent if, whenever we can rewrite
a configuration γ_1 to γ_2 as well as to γ_3, then there exists γ_4 such that
we can further rewrite both γ_2 and γ_3 into γ_4. Confluence guarantees that
all terminating runs from a common configuration also end in a common
configuration [26]. Since, for our rewrite relation ⇒, either all runs on some
input are nonterminating, or none is, the following theorem implies that the
same final result of a program P on an input t, if defined at all, will be
obtained regardless of the order in which we process active statements.

Theorem 1. Our rewrite relation ⇒ is confluent.

Proof. The proof is a very easy application of a basic theorem of Rosen
about subtree replacement systems. A subtree replacement system R
is a (typically infinite) set of pairs of the form φ → ψ, where φ and ψ
are descriptions up to isomorphism of ordered, node-labeled trees, where
the node labels come from some (again typically infinite) set V. Let us
refer to such trees as V-trees. Such a system R naturally induces a rewrite
system ⇒_R on V-trees: we have t ⇒_R t′ if there exists a node n of t
and a pair φ → ψ in R such that the subtree t/n is isomorphic to φ, and
t′ = t(n ← ψ). Here, we use the notation t/n for the subtree of t rooted at
n, and the notation t(n ← ψ) for the tree obtained from t by replacing t/n
by a fresh copy of ψ. Rosen's theorem states that if R is "unequivocal" and
"closed", then ⇒_R is confluent.

"Unequivocal" means that for each φ there is at most one ψ such that
φ → ψ is in R. The definition of R being "closed" is a bit more complicated.
To state it, we need the notion of a residue map from φ to ψ. This is a
mapping r from the nonroot nodes of φ to sets of nonroot nodes of ψ, such
that for m ∈ r(n) the subtrees φ/n and ψ/m are isomorphic. Moreover, if
n_1 and n_2 are independent (no descendants of each other), then all nodes in
r(n_1) must also be independent of all nodes in r(n_2).

Now R being closed means that we can assign a residue map r[φ, ψ]
to every φ → ψ in R in such a way that for any φ_0 → ψ_0 in R, and
any node n of φ_0, if there exists a pair φ_0/n → ψ in R, then the pair
φ_0(n ← ψ) → ψ_0(r[φ_0, ψ_0](n) ← ψ) is also in R. Denoting the latter pair by
φ_1 → ψ_1, we must moreover have for each node p of φ_0 that is independent
of n, that r[φ_1, ψ_1](p) = r[φ_0, ψ_0](p).

To apply Rosen's theorem, we view configurations (Definition 9) as
V-trees, where V = Statements ∪ (Statements × Contexts). Here, Statements
is the set of all possible syntactic forms of statements. So, given a
configuration, we take the syntax tree of the underlying template, and label every
inactive node by its corresponding statement, and every active node by its
corresponding statement and its context in the configuration. (Since
templates are sequences, we actually get V-forests rather than V-trees, but that
is a minor point.)

Now consider the subtree replacement system R consisting of all pairs
γ → γ′ for which γ ⇒ γ′ as defined by our semantics, where γ consists
of a single statement S_0, and the active statement being processed to get
γ′ is a direct child of S_0. Since our semantics always substitutes siblings
for siblings, it is clear that ⇒_R then coincides with our rewrite relation ⇒.
Since the processing of every individual statement is always deterministic
(up to isomorphism of trees), R as just defined is clearly unequivocal.

We want to show that R is closed. Thereto, we define residue maps
r[γ, γ′] as follows.

The case where γ → γ′ is the processing of an apply- or call-statement
is depicted in Figure 9 (top). The node being processed is shown in black.
The subtemplates to the left and right are left untouched. Referring to the
notation used in Figure 5, the newly substituted subtemplate γ_new is such
that γ_1 ... γ_k ⇒_if! γ_new (for apply) or γ_1 ⇒_if! γ_new (for call). Indeed, since we
apply ⇒_if! at the end of every processing step, γ itself does not contain any
active if-statements. We define r = r[γ, γ′] as follows:

• For nodes n in γ_left or γ_right, we put r(n) := {n′}, where n′ is the
corresponding node in γ′.

• For the black node b, we put r(b) := ∅.

The main condition for closedness is clearly satisfied, because statements
can be processed independently. Note that the black node has no children,
let alone active children, which allows us to put r(b) = ∅. The condition on
p's is also satisfied, because both r[φ_0, ψ_0] and r[φ_1, ψ_1] will set r(p) to {p′}.

The case where γ → γ′ is the processing of a foreach-statement is
depicted in Figure 9 (middle). This case is analogous to the previous one.
The only difference is that the black node now has descendants (M in the
figure). Because the init function (Definition 13) always leaves descendants
of a foreach node inactive, however, the nodes in M are inactive at this time,
and we can put r(n) := ∅ for all of them.

The case where γ → γ′ is the processing of a val-statement is depicted
in Figure 9 (bottom). Since all nodes in the update set are inactive by
definition (Definition 15), we can again put r(n) := ∅ for all nodes in the
update set. The case of a tree-statement is similar; now the black node
again has descendants, but again these are all inactive (they are all
cons-statements). The case where γ → γ′ is the processing of a copy-statement,
finally, is again analogous. □

6 Computational completeness 

As defined in Definition 11, an XSLT program P expresses a partial function
from data trees to data forests, where the output forest is represented by a
tree by affixing a root node labeled doc on top (Definition 1). The output
is defined up to isomorphism only, and P does not distinguish between
isomorphic inputs. This leads us to the following definition:

Figure 9: Illustration to the proof of Theorem 1.


Definition 19. A tree transformation is a partial function from data trees 
to data trees with root labeled doc, mapping isomorphic trees to isomorphic 
trees. 

Using the string representation of data trees defined in Section 2.1, we
further define:

Definition 20. A tree transformation f is called computable if the string
function string(t) ↦ string(f(t)) is computable in the classical sense.

Up to now, we have assumed from our XPath abstraction X only the 
availability of the expressions '/*' and 'child: :*'. For our proof of the 
following theorem, we need to assume the availability of a few more very 
simple expressions, also present in real XPath: 

• y/*, for any tree variable y, evaluates to the root of the tree assigned
to y.

• //* evaluates to the sequence of all nodes in the store (it does not
matter in which order).

• child::*[1] evaluates to the first child of the context item (which
should be a node).

• following-sibling::*[1] evaluates to the immediate right sibling
of the context node, or the empty sequence if the context node has no
right siblings.

• Increment, decrement, and test on counters: the constant expression
'1', and the expressions 'x+1', 'x-1', and 'x=1' for any value variable x,
which should consist of a single counter. If x has the maximal counter
value, then x+1 need not be defined, and if x has value 1, then x-1
need not be defined. The test x=1 yields any nonempty sequence for
true and the empty sequence for false.

• name() = 'a', for any a ∈ Σ, returning any nonempty sequence if the
label of the context node is a, and the empty sequence otherwise.

• () evaluates to the empty sequence. 
We establish: 

Theorem 2. Every computable tree transformation f can be realised by a 
program. 

Proof. We can naturally represent any string s over some finite alphabet as a
flat data tree over the same alphabet. We denote this flat tree by flattree(s).
Its root is labeled doc, and has k children, where k is the length of s, such
that the labels of the children spell out the string s. There are no other
nodes.

The proof now consists of three parts:

1. Program the transformation t ↦ flattree(string(t)).

2. Show that every Turing machine (working on strings) can be simulated
by some program working on the flattree representation of strings.

3. Program the transformation flattree(string(t)) ↦ t.

The theorem then follows by composing these three steps, where we
simulate a Turing machine for f in step 2. Note that the composition of three
programs can be written as a single program, using a temporary tree to
pass the intermediate results, and using modes to keep the rules from the
different programs separate.

    template tree2string match (//*)
    {
      cons a { }
      cons lbrace { }
      apply (child::*)
      cons rbrace { }
    }

Figure 10: From t to flattree(string(t)).

The programs for steps 1 and 3 are shown in Figures 10 and 11. For
simplicity, they are for an alphabet consisting of a single letter a, but it is
obvious how to generalise the programs. The real XSLT versions are given in
the Appendix. We point out that these programs are actually 1.0 programs,
so it is only for step 2 of the proof that we need XSLT 2.0.
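
The effect of steps 1 and 3 can be mimicked outside XSLT by the following Python sketch. It is only an illustration under our own assumptions: `tree2string` mirrors the rule of Figure 10 generalised to arbitrary labels, whereas `string2tree` is a conventional recursive-descent parse and not the counter-based sibling search used by the program of Figure 11.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def tree2string(t: Node) -> List[str]:
    """Labels of the children of flattree(string(t)): label, lbrace, children, rbrace."""
    out = [t.label, "lbrace"]
    for c in t.children:
        out.extend(tree2string(c))
    out.append("rbrace")
    return out


def flattree(labels: List[str]) -> Node:
    """Flat tree: a root labeled doc whose children spell out the string."""
    return Node("doc", [Node(l) for l in labels])


def string2tree(labels: List[str], pos: int = 0) -> Tuple[Node, int]:
    """Inverse of tree2string; returns the parsed tree and the next position."""
    label = labels[pos]
    if labels[pos + 1] != "lbrace":
        raise ValueError("expected lbrace after a label")
    pos += 2
    children = []
    while labels[pos] != "rbrace":
        child, pos = string2tree(labels, pos)
        children.append(child)
    return Node(label, children), pos + 1


t = Node("a", [Node("b"), Node("c", [Node("a"), Node("b")]), Node("c")])
flat = flattree(tree2string(t))
back, _ = string2tree([n.label for n in flat.children])
print(back == t)
```

Each node of t contributes exactly three symbols (its label and a pair of braces), so the flat tree has three times as many children as t has nodes.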

For step 2, we can represent a configuration of a turing machine A by two 
temporary trees left and right. At each step, variable right holds (as a 
flat tree) the content of the tape starting at the head position and ending in 
the last tape cell; variable left holds the reverse of the tape portion left of 
the head position. To keep track of the current state of the machine, we use 
value variables q for each state q of A, such that at each step precisely one 
of these is nonempty. (This is why we need the X-expression ().) Changing
the symbol under the head to an a amounts to assigning a new content 
to right by putting in cons a {}, followed by copies of the nodes in the 
current content of right, where we skip the first one. Moving the head a 
cell to the right amounts to assigning a new content to left by putting in a 
node labeled with the current symbol, followed by copies of the nodes in the 
current content of left. We also assign a new content to right in the now 
obvious way; if we were at the end of the tape we add a new node labeled 
blank. Moving the head a cell to the left is simulated analogously. The only 
X-expressions we need here are the ones we have assumed to be available.
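The updates on left and right described above amount to the familiar two-stack view of a tape. A minimal Python sketch, assuming lists of symbols as stand-ins for the two temporary flat trees (the function names are hypothetical and not taken from the paper):

```python
from typing import List, Tuple

BLANK = "blank"


def write(right: List[str], symbol: str) -> List[str]:
    """New content of right: the new symbol, then the old nodes minus the first."""
    return [symbol] + right[1:]


def move_right(left: List[str], right: List[str]) -> Tuple[List[str], List[str]]:
    """Put the current symbol in front of left; drop it from right.

    If the head was on the last tape cell, a blank node is added.
    """
    new_left = [right[0]] + left
    new_right = right[1:] or [BLANK]
    return new_left, new_right


def move_left(left: List[str], right: List[str]) -> Tuple[List[str], List[str]]:
    """Analogous move; this sketch assumes the head is not on the leftmost cell."""
    return left[1:], [left[0]] + right
```

Because left holds the reverse of the tape portion to the left of the head, each move only touches the first node of each list, which matches the "skip the first one" phrasing above.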

The simulation thus consists of repeatedly calling a big if-then-else that 
tests for the transition to be performed, and performs that transition. We 
may assume A is programmed in such a way that the final output is produced 
starting from a designated state. In this way we can build up the final output 
string in a fresh temporary tree and pass it to step 3. □ 
template doc match (/*)
{
  apply (child::*[1])
}

template string2tree match (//*)
{
  cons a
    { apply (following-sibling::*[1]) mode dochildren }
  val counter (1)
  call searchnextsibling
}

template dochildren match (//*) mode dochildren
{
  if name() = 'lbrace'
    { apply (following-sibling::*[1]) mode dochildren }
  else {
    if name() = 'a'
      { call string2tree }
    else { }
  }
}

template searchnextsibling match (//*) mode search
{
  if name() = 'lbrace' {
    val counter (counter + 1)
    apply (following-sibling::*[1]) mode search
  }
  else {
    if name() = 'a'
      { apply (following-sibling::*[1]) mode search }
    else {
      val counter (counter - 1)
      if counter = 1
        { apply (following-sibling::*[1])
          mode dochildren }
      else
        { apply (following-sibling::*[1]) mode search }
    }
  }
}

Figure 11: From flattree(string(t)) to t.
7 XSLT 1.0



In this section we will show that every XSLT 1.0 program can be imple- 
mented in exponential time, in sharp contrast to the computational com- 
pleteness result of the previous section. 

A fundamental difference between XSLT 1.0 and 2.0 is that in 1.0,
X-expressions are "input-only", defined as follows.

Definition 21. 1. Let C = (S, E, (z, i, k)) be a context. Let the input
tree in S be t. Then we call C input-only if every value appearing in
E is already a value over the store {(Input, t)}, and also (z, i, k) is like
that.

2. By $\hat{C}$ we mean the context ({(Input, t)}, E, (z, i, k)). So, $\hat{C}$
equals C where we have removed all temporary trees.

3. Now an X-expression e is called input-only if for any input-only context
C for which eval(e, C) is defined, we have eval(e, C) = eval(e, $\hat{C}$), and
this must be a value over C's input tree only.

In other words, input-only expressions are oblivious to the temporary 
trees in the store; they only see the input tree. 
We further define: 

Definition 22. An input-only X-expression e is called polynomial if for
each input-only context C, the computation of eval(e, C) can be done in
time polynomial in the size of C's input tree.

We now define: 

Definition 23. A program is called 1.0 if it only uses input-only, polynomial
X-expressions.

Essentially, 1.0 programs cannot do anything with temporary trees ex- 
cept copy them using tcopy statements. We note that real XPath 1.0 ex- 
pressions are indeed input-only and polynomial; actually, real XPath 1.0 
is much more restricted than that, but for our purpose we do not need to 
assume anything more. 

In order to establish an exponential upper bound on the time-complexity
of 1.0 programs, we cannot use an explicit representation of the output tree.
Indeed, 1.0 programs can produce result trees of size doubly exponential
in the size of the input tree. For example, using subsets of input nodes,
ordered lexicographically, as depth counters, we can produce a full binary
tree of depth $2^n$ from an input tree with n nodes. Obviously a doubly
exponentially long output could never be computed in singly exponential
time.

Figure 12: Left, a data tree, and right, a DAG representation of it.

We therefore use a DAG representation of trees: an old and well-known 
trick [22] that is also used in tree transduction and that has recently
found new applications in XML UUj. Formally, a DAG representation is a 
collection Q of trees, where trees in Q can have special leaves which are not
labeled, and from which a pointer departs to the root of another tree in
Q. On condition that the resulting pointer graph is acyclic, starting from
a designated "root tree" in Q we can naturally obtain a tree by unfolding
along the pointers. An illustration is shown in Figure 12.
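To make the gain of the DAG representation concrete, here is a small Python sketch. It assumes trees encoded as (label, children) pairs, a hypothetical Ptr class for the unlabeled pointer leaves, and string names such as t0, t1 for the trees in Q; none of these names come from the paper.

```python
class Ptr:
    """Unlabeled special leaf pointing to the root of another tree in Q."""

    def __init__(self, target):
        self.target = target


def unfold(q, node):
    """Unfold along the pointers into an ordinary (label, children) tree."""
    if isinstance(node, Ptr):
        return unfold(q, q[node.target])
    label, children = node
    return (label, [unfold(q, c) for c in children])


def size(q, node, memo):
    """Number of nodes of the unfolded tree, computed without unfolding."""
    if isinstance(node, Ptr):
        if node.target not in memo:
            memo[node.target] = size(q, q[node.target], memo)
        return memo[node.target]
    label, children = node
    return 1 + sum(size(q, c, memo) for c in children)


def full_binary_dag(depth):
    """Q for a full binary tree: one small tree per level, sharing subtrees."""
    q = {"t0": ("a", [])}
    for d in range(1, depth + 1):
        q["t%d" % d] = ("a", [Ptr("t%d" % (d - 1)), Ptr("t%d" % (d - 1))])
    return q
```

With `q = full_binary_dag(100)`, the collection holds 101 small trees, while `size(q, q["t100"], {})` reports that the unfolded tree has 2^101 - 1 nodes. Acyclicity of the pointer graph is what guarantees that both functions terminate.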

We establish: 

Theorem 3. Let P be a 1.0 program. Then the following problem is solvable
in exponential, i.e., $2^{n^{O(1)}}$ time:

Input: a data tree t 

Output: a DAG representation of the final result tree of applying P to t, 
or a message signaling non-termination if P does not terminate on t. 

Proof. We will generate a DAG representation Q by applying modified ver- 
sions of the semantic rules from Section [21 We initialise Q with all the 
subtrees of t. These trees have no pointers. Each tree that will be added 
to Q will be a configuration, which still has to be developed further into a 
final data tree with pointers, using the same modified rules. Because we 
will have to point to the newly added configurations later, we identify each 
added configuration by a pair (name, C) where name is the name of a tem-
plate rule in P and C is a context. In the description below, whenever we
say that we "add" a configuration to Q, identified by some pair (name, C),
we really mean that we add it unless a configuration identified by that same 
pair already exists in Q. 

The modifications are now the following. 

1. When executing an apply-statement, we do not directly insert copies
of the templates belonging to the rules that must be applied (the
$\gamma_i$'s in Figure IHI). Rather, we add, for $i = 1, \dots, k$, the
configuration $\gamma_i'$ to Q, where $\gamma_i \Rightarrow \gamma_i'$. We identify
$\gamma_i'$ by the pair $(\mathit{name}_i, C_i)$, with $\mathit{name}_i$ the name of
the rule $\gamma_i$ comes from, and $C_i = (S, E, (n_i, i, k))$ using
the notation of Figure El. Moreover, in place of the apply-statement
we insert a sequence of k pointer nodes pointing to
$(\mathit{name}_1, C_1), \dots, (\mathit{name}_k, C_k)$, respectively.

2. When executing a call-statement call name under context C, we again
do not insert $\gamma_1$ (compare Figure ^), but add the configuration
$\gamma_1'$ to Q, where $\gamma_1 \Rightarrow \gamma_1'$, and identify it by the pair
(name, C). We then replace the statement by a pointer node pointing to
that pair.

3. By making template rules from the bodies of all foreach-statements in 
P, we may assume without loss of generality that the body of every 
foreach-statement is a single call-statement. A foreach-statement is 
then processed analogously to apply- and call-statements. 

4. As we did with foreach-statements, we may assume that the body of 
each tree-statement is a single call-statement. When executing a tree- 
statement, we may assume that the call-statement has already been 
turned into a pointer to some pair $(\mathit{name}_0, C_0)$. We then assign that
pair directly to y in the new context C (compare Figure IH)); we no 
longer apply maketree. 

So, in the modified kind of store we use, we assign name-context pairs, 
rather than fully specified temporary trees, to tree variables. 

5. Correspondingly, when executing a statement tcopy y, we now directly 
turn it into a pointer to the pair assigned to y. 

6. Finally, when executing a vcopy-statement, we do not insert the whole
forest generated by $(n_1, \dots, n_k)$ in the configuration (compare
Figure lSl), but merely insert a sequence of k pointers to the input
subtrees rooted at $n_1, \dots, n_k$, respectively.
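The common pattern behind these modifications is that a rule invocation is replaced by a pointer to a (name, context) pair, and that each pair is developed at most once. The following Python sketch shows only that pattern on a toy model; it is not the semantics of the paper. Assumptions of the sketch: a context is any hashable value, a template is a function from a context to a flat list of items, and an item is either ("node", label) or ("call", name, context).

```python
from collections import deque


def generate(program, start):
    """Develop configurations reachable from start, keyed by (name, context).

    Calls are replaced by ("ptr", (name, context)) items, and a pair that is
    already in Q is never developed a second time.
    """
    q = {}
    todo = deque([start])
    while todo:
        key = todo.popleft()
        if key in q:  # "add" means: add unless the pair already exists
            continue
        name, ctx = key
        developed = []
        for item in program[name](ctx):
            if item[0] == "call":
                target = (item[1], item[2])
                developed.append(("ptr", target))
                todo.append(target)
            else:
                developed.append(item)
        q[key] = developed
    return q


def toy_template(depth):
    """Hypothetical rule: output a node, then recurse twice with depth - 1."""
    if depth == 0:
        return [("node", "a")]
    return [("node", "a"), ("call", "t", depth - 1), ("call", "t", depth - 1)]
```

Running `generate({"t": toy_template}, ("t", 3))` develops four configurations, although the unfolded output has 15 nodes. The loop terminates only when finitely many pairs are reachable; detecting the other case is a separate concern.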
We initiate the generation of Q by starting with the initial configuration
as always. Processing that configuration will add the first tree to Q, which 
serves as the root tree of the DAG representation. When all trees in Q 
have been fully developed into data trees with pointer nodes, the algorithm 
terminates. In case P does not terminate on t, however, that will never 
happen, and we need a way to detect nontermination. 

Thereto, recall that every context consists of an environment E and a
context triple c on the one hand, and a store S on the other hand. Since
all X-expressions used are input-only, and thus oblivious to the store-part
of a context (except for the input tree, which does not change), we are
in an infinite loop from the moment that there is a cycle in Q's pointer
graph where we ignore the store-part of the contexts. More precisely, this
happens when from a pointer node in a tree identified by $(\mathit{name}, C_1)$ we
can follow pointers and reach a pointer to a pair $(\mathit{name}, C_2)$ with the same
name and where $C_1$ and $C_2$ are equal in their (E, c)-parts. As soon as we
detect such a cycle, we terminate the algorithm and signal nontermination.
Note that thus the algorithm always terminates. Indeed, since only input-only
X-expressions are used, all contexts that appear in the computation
are input-only, and there are only a finite number of possible (E, c)-parts of
input-only contexts over a fixed input tree.
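The nontermination test can be phrased as ordinary cycle detection on the pointer graph after the store-part has been dropped from every identifying pair. A minimal Python sketch, assuming that a pair is encoded as a hashable tuple (name, E, c, S) and that the pointer graph is given as a dictionary from pairs to lists of pairs; this encoding is an assumption of the sketch.

```python
def project(key):
    """Drop the store-part S of an identifying pair (name, E, c, S)."""
    name, env, triple, store = key
    return (name, env, triple)


def has_loop(pointers):
    """True iff the pointer graph has a cycle once stores are ignored."""
    graph = {}
    for src, targets in pointers.items():
        graph.setdefault(project(src), set()).update(project(t) for t in targets)

    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(v):
        colour[v] = GREY  # v is on the current depth-first path
        for w in graph.get(v, ()):
            c = colour.get(w, WHITE)
            if c == GREY or (c == WHITE and visit(w)):
                return True
        colour[v] = BLACK  # fully explored, no cycle through v
        return False

    return any(colour.get(v, WHITE) == WHITE and visit(v) for v in list(graph))
```

Two pairs that differ only in their stores collapse to the same vertex, so a pointer from ("t", E, c, S0) to ("t", E, c, S1) is reported as a loop, in line with the criterion stated above.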

Let us analyse the complexity of this algorithm. Since all X-expressions
used are polynomial, there is a natural number K such that each value that
appears in a context is at most $n^K$ long, where n equals the number of nodes
in t. Each element of such a length-$n^K$ sequence is a node or a counter over t,
so there are at most $(2n)^{n^K}$ different values. There are a constant $c_1$ number
of different value variables in P, so there are at most $((2n)^{n^K})^{c_1}$ different
environments. Likewise, the number of different context triples is $(2n)^3$,
so, ignoring the stores, there are in total at most
$(2n)^3 \cdot (2n)^{c_1 n^K} \le 2^{n^{K'}}$
different contexts, for some natural number $K' \ge K$. With a constant $c_2$
number of different template names in P, we get a maximal number of
$c_2 2^{n^{K'}}$ different configurations that can be added to Q before the algorithm
will surely terminate.
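The inequality bounding the number of contexts can be checked by taking base-2 logarithms. The derivation below is a sketch under two assumptions: that the context triples contribute a factor $(2n)^3$, and that $n \ge 2$. The integer $m$ is introduced here for illustration only.

```latex
\begin{align*}
\log_2\bigl((2n)^{3}\cdot(2n)^{c_1 n^{K}}\bigr)
  &= (3 + c_1 n^{K})\,(1 + \log_2 n) \\
  &\le (3 + c_1)\,n^{K}\cdot n
     && \text{since } n^{K} \ge 1 \text{ and } 1 + \log_2 n \le n \\
  &\le n^{m}\cdot n^{K+1}
     && \text{for any } m \text{ with } 2^{m} \ge 3 + c_1 \\
  &= n^{K+1+m}.
\end{align*}
```

Under these assumptions one may therefore take $K' = K + 1 + m$, which indeed satisfies $K' \ge K$.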

It remains to see how long it takes to fully rewrite each of those
configurations into a data tree with pointers. A configuration initially
consists of at most a constant $c_3$ number of statements. The evaluation of
X-expressions, which are polynomial, takes at most $c_3 n^K$ time in total.
Processing an apply- or a foreach-statement takes at most $c_3 n^K$
modifications to the configuration and to Q; for the other statements this
takes at most $c_3$ such operations. Each such operation, however, involves the
handling of contexts, whose stores can become quite large if treated naively.
Indeed, tree-statements assign a context to a tree variable, yielding a new
context which may then again be assigned to a tree variable, and so on. To
keep this under control, we do not copy the contexts literally, but number
them consecutively in the order they are introduced in Q. A map data
structure keeps track of this numbering. The stores then consist of an at
most constant $c_4$ number of assignments of pairs (name, context number) to
tree variables.

As there are at most $2^{n^{K'}}$ different contexts, each number is at most $n^{K'}$
bits long. Looking up whether a given context is already in Q, and if so,
finding its number, takes $O(\log 2^{n^{K'}}) = O(n^{K'})$ time using a suitable map
data structure.
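The numbering of contexts is a form of interning. A minimal Python sketch, assuming contexts are given as hashable triples (E, c, S) in which the store S only contains pairs (name, context number); the class and method names are hypothetical. Python's dict is a hash table rather than the logarithmic-time map assumed in the analysis, so the sketch shows the bookkeeping, not the exact cost model.

```python
class ContextTable:
    """Assigns consecutive numbers to contexts in order of introduction."""

    def __init__(self):
        self._number = {}    # context -> its number
        self._contexts = []  # number -> context

    def intern(self, env, triple, store):
        """Return the number of the context, registering it if it is new."""
        key = (env, triple, store)
        if key not in self._number:
            self._number[key] = len(self._contexts)
            self._contexts.append(key)
        return self._number[key]

    def lookup(self, number):
        """Recover the context registered under the given number."""
        return self._contexts[number]
```

A store built by a tree-statement then refers to an earlier context by number, for instance `(("y", ("t", 0)),)`, so nesting assignments does not make the stored contexts grow.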

We conclude that the processing of Q takes a total time of $c_2 2^{n^{K'}}$
multiplied by a factor polynomial in n, which is $2^{n^{O(1)}}$ as claimed.



A legitimate question is whether the complexity bound given by Theorem 3
can still be improved. In this respect we can show that, even within
the limits of real XSLT 1.0, any linear-space Turing machine can be
simulated by a 1.0 program. Note that some PSPACE-complete problems, such
as QBF-SAT [21], are solvable in linear space. This shows that the time
complexity upper bound of Theorem 3 cannot be improved without showing
that PSPACE is properly included in EXPTIME (a famous open problem).

The simulation gets as input a flat tree representing an input string, and
uses the n child nodes to simulate the n tape cells. For each letter a of the
tape alphabet, a value variable cell_a holds the nodes representing the tape
cells that have an a. A value variable head holds the node representing the
cell seen by the machine's head. The machine's state is kept by additional
value variables state_q for each state q, such that state_q is nonempty iff the
machine is in state q. Writing a letter in a cell, moving the head left or right,
or changing state, are accomplished by easy updates on the value variables,
which can be expressed by real XPath 1.0 expressions. Choosing the right
transition is done by a big if-then-else statement. Successive transitions
are performed by recursively applying the simulating template rule until a
halting state is reached.
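The bookkeeping of this simulation can be sketched in Python. Assumptions of the sketch: tape cells are identified by their positions 0, ..., n-1, the variables cell_a become a dictionary from letters to sets of positions, the state is kept as a plain string instead of one variable per state, and the transition table delta maps (state, symbol) to (new state, written symbol, head move). The example machine is hypothetical and never leaves the n cells.

```python
def step(cell, head, state, delta):
    """Perform one transition by updating the cell sets and the head."""
    symbol = next(a for a, nodes in cell.items() if head in nodes)
    new_state, new_symbol, move = delta[(state, symbol)]
    cell[symbol].discard(head)   # the cell no longer holds the old letter
    cell[new_symbol].add(head)   # it now holds the written letter
    return head + move, new_state


def run(word, delta, start, halting):
    """Simulate on the cells of word until a halting state is reached."""
    alphabet = set(word)
    alphabet |= {s for (_, s) in delta}
    alphabet |= {w for (_, w, _) in delta.values()}
    cell = {a: set() for a in alphabet}
    for i, a in enumerate(word):
        cell[a].add(i)
    head, state = 0, start
    while state not in halting:
        head, state = step(cell, head, state, delta)
    return "".join(next(a for a in cell if i in cell[a]) for i in range(len(word)))


# Hypothetical machine: overwrite every a by b, halt on the end marker $.
DELTA = {
    ("scan", "a"): ("scan", "b", 1),
    ("scan", "b"): ("scan", "b", 1),
    ("scan", "$"): ("halt", "$", 0),
}
```

Calling `run("aab$", DELTA, "scan", {"halt"})` rewrites the tape to `bbb$`. No new cells are ever created, which is why input-only values over the n input nodes suffice for a linear-space machine.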

Remark 7.1. A final remark is that our results imply that XSLT 1.0 is not
closed under composition. Indeed, building up a tree of doubly exponential
size (as we already remarked is possible in XSLT 1.0), followed by the
building up of a tree of exponential size, amounts to building up a tree of
triply exponential size. If that were possible by a single program, then
a DAG representation of a triply exponentially large tree would be
computable in singly exponential time. It is well known, however, that a DAG
representation cannot be more than singly exponentially smaller than the
tree it represents. Closure under composition is another sharp contrast
between XSLT 1.0 and 2.0, as the latter is indeed closed under composition as
already noted in the proof of Theorem 2.

8 Conclusions 

W3C recommendations such as the XSLT specifications are no Holy scrip- 
tures. Theoretical scrutinising of W3C work, which is what we have done 
here, can help in better understanding the possibilities and limitations of 
various newly proposed programming languages related to the Web, even- 
tually leading to better proposals. 

A formalisation of the full XSLT 2.0 language, with all the dirty details
concerning both the language itself and the XPath 2.0 data model,
is probably something that should be done. We believe our work gives a clear
direction for how this could be done.

Note also that XSLT contains a lot of redundancies. For example, 
foreach-statements are eliminable, as are call-statements, and the match 
attribute of template rules. A formalisation such as ours can provide a 
rigorous foundation to prove such redundancies, or to prove correct various 
processing strategies or optimisation techniques XSLT implementations may 
use. 

A formal tree transformation model denoted by TL, in part inspired by 
XSLT, but still omitting many of its features, has already been studied by 
Maneth and his collaborators [31 E]. The TL model can be compiled into 
the earlier formalism of "macro tree transducers" jl2| \2'A\ . It is certainly 
an interesting topic for further research to similarly translate our XSLT for- 
malisation (even partially) into macro tree transducers, so that techniques 
already developed for these transducers can be applied. For example, un- 
der regular expression types ^Sl (known much earlier under the name of 
"recognisable tree languages"), exact automated typechecking is possible 
for compositions of macro tree transducers, using the method of "inverse 
type inference" [10] . This method has various other applications, such as 
deciding termination on all possible inputs ^HI- Being able to apply this 
method to our XSLT 1.0 formalism would improve the analysis techniques 
of Dong and Bailey which are not complete. 
A Real XSLT programs

A.1 Figure 10 in real XSLT

<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

<xsl:template name="tree2string" match="//*">
  <a/>
  <lbrace/>
  <xsl:apply-templates select="child::*"/>
  <rbrace/>
</xsl:template>

</xsl:transform>

A.2 Figure 11 in real XSLT

<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

<xsl:template match="/doc">
  <xsl:apply-templates select="child::*[1]"/>
</xsl:template>

<xsl:template name="string2tree" match="/doc//*">
  <a>
    <xsl:apply-templates select="following-sibling::*[1]" mode="dochildren"/>
  </a>
  <xsl:call-template name="searchnextsibling">
    <xsl:with-param name="counter" select="1"/>
  </xsl:call-template>
</xsl:template>

<xsl:template match="//*" mode="dochildren">
  <xsl:if test="name() = 'lbrace'">
    <xsl:apply-templates select="following-sibling::*[1]" mode="dochildren"/>
  </xsl:if>
  <xsl:if test="name() = 'a'">
    <xsl:call-template name="string2tree"/>
  </xsl:if>
</xsl:template>

<xsl:template name="searchnextsibling" match="//*" mode="search">
  <xsl:param name="counter"/>
  <xsl:if test="name()='lbrace'">
    <xsl:apply-templates select="following-sibling::*[1]" mode="search">
      <xsl:with-param name="counter" select="$counter + 1"/>
    </xsl:apply-templates>
  </xsl:if>
  <xsl:if test="name()='a'">
    <xsl:apply-templates select="following-sibling::*[1]" mode="search">
      <xsl:with-param name="counter" select="$counter"/>
    </xsl:apply-templates>
  </xsl:if>
  <xsl:if test="name()='rbrace'">
    <xsl:if test="$counter=2">
      <xsl:apply-templates select="following-sibling::*[1]" mode="dochildren"/>
    </xsl:if>
    <xsl:if test="$counter>2">
      <xsl:apply-templates select="following-sibling::*[1]" mode="search">
        <xsl:with-param name="counter" select="$counter - 1"/>
      </xsl:apply-templates>
    </xsl:if>
  </xsl:if>
</xsl:template>

</xsl:transform>